OSTG | Eclipse TechForgeThinkGeekSlashdotITMJLinux.comNewsForgefreshmeatNewslettersPriceGrabberJobsBroadbandWhitepapers
SourceForge.net Home      
    my sf.net     |     software map     |     donate to sf.net     |     about sf.net    
welcomes dhns (Logout)
We remember you
Register New Project
   Search

   
results by YAHOO! search
 
   SF.net Subscription
 · Subscribe Now
 · Manage Subscription
 · Realtime Statistics
 · Direct Download
 · Priority Tech Support
 · Project Monitoring
 
   SF.net Resources
 · Site Docs
 · Site Status (07/29)
 · SF.net Supporters
 · Compile Farm
 · Project Help Wanted
 · New Releases
 · SF.net Engineer Blog
 · Get Support
 
   Site Sponsors







 
Most Active
1 Azureus - BitTorrent ClientAccepting Donations
2 Gaim
3 Compiere ERP + CRM Business Solution
4 InkscapeAccepting Donations
5 SNAP Platform and SNAPPIXAccepting Donations
6 phpMyAdminAccepting Donations
7 7-Zip
8 PDFCreatorAccepting Donations
9 FCKeditor
10 FileZillaAccepting Donations

More Activity>>

Top Downloads
1 Azureus - BitTorrent ClientAccepting Donations
2 eMule
3 BitTorrent
4 NASA World Wind
5 Shareaza
6 DC++
7 GTK+ and The GIMP installers for Windows
8 VirtualDub
9 7-Zip
10 FileZillaAccepting Donations

More Statistics>>

   SF.net Services

 · Jobs
 · PriceGrabber
 · Whitepapers
 · Partner Product Offers
 · Get Broadband
 · IT Product Guide
 
   Sponsored Content
 

   Project: open geo coordinates database: Document Manager: Display Document


Summary |  Admin |  Home Page |  Forums |  Tracker |  Lists |  Tasks |  Docs |  News |  CVS |  Files | 

Submit new documentation | View Documentation | Admin

1)Introduction to the database scheme 2)The relational tables in details 3)The praxis of SQL in opengeodb 3.1)Introduction: All names of towns mit their area code 3.2)For advanced people: Area codes and towns of Liechtenstein country 3.3)Much too simple: All Cantones of Switzerland 4)Special cases and other notes 4.1)- Dates giving the validity of entries 4.2)- Special attributes in geodb_textdata 4.3)- geodb_coordinates is still beta



1) INTRODUCTION TO THE DATABASE SCHEME:

The opengeodb database consists mainly of the relational tables (Tabellen) geodb_locations, geodb_hierarchies, geodb_coordinates and geodb_textdata currently.

The other tables named geodb_floatdata, geodb_intdata, geodb_areas, geodb_polygons and geodb_type_names do not (yet) play any (important) role.


2) THE RELATIONAL TABLES IN DETAILS:

geodb_locations:
      create table geodb_locations (
        loc_id               serial primary key,
        loc_type             integer not null
          check (loc_type = CONTINENT or loc_type = STATE or
                 loc_type = NUTS_I or loc_type = NUTS_II or
                 loc_type = NUTS_III or loc_type = POL_DIVISION or
                 loc_type = POPULATED_AREA or loc_type = LOC_AREA_CODE)
      );
geodb_locations is the main relation where any location has to have an entry. It defines an attribute named loc_id meaning a unique identifier, identifying this locations in all others tables as well.

The second attribute is named loc_type, and it should make clear, what kind of entry we can expect here: Are the geo-coordinates and data etc. related to a town or to an area code or whatever.

geodb_hierarchies:
    create table geodb_hierarchies (
      loc_id               integer not null references geodb_locations,
      level                integer not null check (level>0 and level<=9),
      id_lvl1              integer not null,
      id_lvl2              integer,
      id_lvl3              integer,
      id_lvl4              integer,
      id_lvl5              integer,
      id_lvl6              integer,
      id_lvl7              integer,
      id_lvl8              integer,
      id_lvl9              integer,
      valid_since          date,
      date_type_since      integer,
      valid_until          date not null,
      date_type_until      integer not null
      check (
        (
          (level = 1 and /* loc_id = id_lvl1 and */
                         id_lvl2 is null and id_lvl3 is null and
                         id_lvl4 is null and id_lvl5 is null and
                         id_lvl6 is null and id_lvl7 is null and
                         id_lvl8 is null and id_lvl9 is null) or
          (level = 2 and /* loc_id = id_lvl2 and */
                         id_lvl1 is not null and id_lvl3 is null and
                         id_lvl4 is null and id_lvl5 is null and
                         id_lvl6 is null and id_lvl7 is null and
                         id_lvl8 is null and id_lvl9 is null) or
          (level = 3 and /* loc_id = id_lvl3 and */
                         id_lvl1 is not null and id_lvl2 is not null and
                         id_lvl4 is null and id_lvl5 is null and
                         id_lvl6 is null and id_lvl7 is null and
                         id_lvl8 is null and id_lvl9 is null) or
          (level = 4 and /* loc_id = id_lvl4 and */
                         id_lvl1 is not null and id_lvl2 is not null and
                         id_lvl3 is not null and id_lvl5 is null and
                         id_lvl6 is null and id_lvl7 is null and
                         id_lvl8 is null and id_lvl9 is null) or
          (level = 5 and /* loc_id = id_lvl5 and */
                         id_lvl1 is not null and id_lvl2 is not null and
                         id_lvl3 is not null and id_lvl4 is not null and
                         id_lvl6 is null and id_lvl7 is null and
                         id_lvl8 is null and id_lvl9 is null) or
          (level = 6 and /* loc_id = id_lvl6 and */
                         id_lvl1 is not null and id_lvl2 is not null and
                         id_lvl3 is not null and id_lvl4 is not null and
                         id_lvl5 is not null and id_lvl7 is null and
                         id_lvl8 is null and id_lvl9 is null) or
          (level = 7 and /* loc_id = id_lvl7 and */
                         id_lvl1 is not null and id_lvl2 is not null and
                         id_lvl3 is not null and id_lvl4 is not null and
                         id_lvl5 is not null and id_lvl6 is not null and
                         id_lvl8 is null and id_lvl9 is null) or
          (level = 8 and /* loc_id = id_lvl8 and */
                         id_lvl1 is not null and id_lvl2 is not null and
                         id_lvl3 is not null and id_lvl4 is not null and
                         id_lvl5 is not null and id_lvl6 is not null and
                         id_lvl7 is not null and id_lvl9 is null) or
          (level = 9 and /* loc_id = id_lvl9 and */
                         id_lvl1 is not null and id_lvl2 is not null and
                         id_lvl3 is not null and id_lvl4 is not null and
                         id_lvl5 is not null and id_lvl6 is not null and
                         id_lvl7 is not null and id_lvl8 is not null)
          ) and
          (
            (valid_since is null and date_type_since is null) or
            (valid_since is not null and date_type_since is not null)
          )
      )
    );
geodb_hierarchies contains the hierarchical (== political) structures, the location is bound to. For this we have nine attributes named id_lvl1 to id_lvl9 containing the loc_ids of all nine political levels. See for example the town of Karwitz (Germany) with the loc_id 27431:

     104 105 116 176 351 19122 27431 null null

  id_lvl1 (104):    Europe
  id_lvl2 (105):    Germany
  id_lvl3 (116):    Lower Saxony
  id_lvl4 (176):    Regierungsbezirk Lüneburg
  id_lvl5 (351):    Landkreis Lüchow-Dannenberg
  id_lvl6 (19122):  Samtgemeinde Dannenberg (Elbe)
  id_lvl7 (27431):  Karwitz
  id_lvl8 (null):   -
  id_lvl9 (null):   -
The attribute "level" denotes the level of the locations itself, in this case it would have to be level seven. A level of six describes "normal" (self-administered) towns, a level of seven describes parts of towns. An entry with a level of two would mean a normal state like Germany or US or UK or India.

geodb_coordinates, geodb_textdata, geodb_floatdata, geodb_intdata:
    create table geodb_coordinates (
      loc_id               integer not null references geodb_locations,
      lon                  double precision,
      lat                  double precision,
      sin_lon              double precision,
      sin_lat              double precision,
      cos_lon              double precision,
      cos_lat              double precision,
      coord_type           integer not null check (coord_type=WGS84),
      coord_subtype        integer,
      valid_since          date,
      date_type_since      integer,
      valid_until          date not null,
      date_type_until      integer not null
    );

    create table geodb_textdata (
      loc_id               integer not null references geodb_locations,
      text_val             varchar(255) not null, /* varchar(2000)? */
      text_type            integer not null,
      text_locale          varchar(5), /* ISO 639-1 */
      is_native_lang       smallint(1),
      is_default_name      smallint(1),
      valid_since          date,
      date_type_since      integer,
      valid_until          date not null,
      date_type_until      integer not null,
        check (
          (
            (
              (text_type = NAME    or text_type = NAME_7BITLC    or
               text_type = NAME_VG or text_type = NAME_VG_7BITLC or
               text_type = SOURCE  or text_type = COMMENT
              ) and
              text_locale like '__%' and
              is_native_lang is not null and
              is_default_name is not null
            ) or
            (
              (text_type = ISO_3166_1_ALPHA_2 or text_type = ISO_3166_2 or
               text_type = AREA_CODE          or text_type = CAR_LICENSE_CODE or
               text_type = CAR_LIC_CODE_NAME
              ) and
              text_locale is null and
              is_native_lang is null and
              is_default_name is null
            )
          ) and
            (
              (valid_since is null and date_type_since is null) or
              (valid_since is not null and date_type_since is not null)
            )
        )
    );

    create table geodb_intdata (
      loc_id               integer not null references geodb_locations,
      int_val              bigint not null,
      int_type             integer not null,
      int_subtype          integer,
      valid_since          date,
      date_type_since      integer,
      valid_until          date not null,
      date_type_until      integer not null
    );

    create table geodb_floatdata (
      loc_id               integer not null references geodb_locations,
      float_val            double precision not null,
      float_type           integer not null,
      float_subtype        integer,
      valid_since          date,
      date_type_since      integer,
      valid_until          date not null,
      date_type_until      integer not null
    );
In these tables you will find any data related to the location, maybe the name or the population or whatever may be available.

While geodb_coordinates basically just contains the longitude and latitude values of the locations, the three following tables are "universal containers" being able to be stuffed with any data fitting into the related format (text / floating point numbers / integers).

The real content is determined by the type attribute in the table, which is text_type, float_type or int_type.

If you like to read the name of a location, you will need a query like:

    SELECT text_val
    FROM geodb_textdata
    WHERE text_type=500100000 /* NAME */ AND
          loc_id=27431;
Currently, geodb_textdata knows about the text types:
  • name
  • name in 7 bit ASCII and small caps (e.g. for sorting)
  • name of an administrative cooperation in Germany
  • same thing, but again 7 bit etc.
  • ISO codes for countries and provinces
  • area codes
  • car license codes
  • official code string for communities in Germany (AGS = Amtlicher Gemeindeschlüssel)
  • source of data (only tests so far)
  • comments (only tests so far)

geodb_floatdata does not yet contain any data, one could think of the size of the area of a location or else.

geodb_intdata just contains some test data, in this case, we have some population data of maybe 10 to 50 data sets. Other data like the altitude of a location are possible.

geodb_type_names:
    create table geodb_type_names (
      type_id              integer not null,
      type_locale          varchar(5) not null,
      name                 varchar(255) not null,
    unique (type_id, type_locale)
    );
The table is supposed to contain names for the text_type, float_type etc. attributes in geodb_textdata etc.. It not yet quite clear, if it is really meaningful to have such information in the database. Currently, we just have some (incomplete) data in it.

geodb_areas, geodb_polygons:
These tables do not contain any data so far. Probably the layout of these tables will change a little or a little more, before we include our first data. The meaning of these tables is to store vector data in them. The most wanted data like this are currently the border lines of states and provinces, but it is not limited to anything.


3) THE PRAXIS OF SQL IN OPENGEODB:

3.1) INTRODUCTION: All names of towns mit their area code

The above SQL sample just shows a very limited query that will normally not be sufficient.

So we do some more elaborate tests here. Let us start with the names of all locations together with there area codes:

    SELECT code.text_val as "area code", name.text_val as "town"
    FROM geodb_textdata code, geodb_textdata name
    WHERE name.loc_id=code.loc_id AND
          code.text_type=500300000 /* AREA_CODE */ AND
          name.text_type=500100000 /* NAME */
    ORDER by 2;
Firstly, wie need TWO DIFFERENT data sets from geodb_textdata: one to read the area code and another one to read the name of the location. For this we have to have geodb_textdata twice in the FROM section. To differentiate both of them (which is necessary anyway), we name them "code" and "name" here.

From the first geodb_textdata we read the area code:

    SELECT code.text_val as "area code" [...] FROM geodb_textdata code
From the second geodb_textdata we read the name of the location:
    SELECT [...], name.text_val as "town" FROM [...], geodb_textdata name
Obviously, the area code and the name should be for one and the same location. This location is uniquely identified by the loc_id as mentioned above, so we need an additional clause in the WHERE section:

 ... WHERE name.loc_id=code.loc_id [...]
Furthermore, the table named "code" is supposed to return area codes only, so we add a constraint for the text_type:
 ... WHERE [...] AND code.text_type=500300000 /* area code */ [...]
In the same way we only want names from the second geodb_textdata table to be returned, so we write:
 ... WHERE [...] AND name.text_type=500100000 /* NAME */

3.2) FOR ADVANCED PEOPLE: Area codes and towns of Liechtenstein country

Ok, this might have been a little too many locations, so how about a restriction on a very small European country called "Liechtenstein"?

In this case, we have to look into geodb_hierarchies, as this is the only place, the political hierarchy can be known of.

Let us remember: the attributes of geodb_hierarchies include:

    loc_id level id_lvl1, which is the loc_id of the continent in which this location is to be found id_lvl2: the state of the location id_lvl3: the province id_lvl4: a high order administrative division, also called NUTS II in Europe id_lvl5: the county id_lvl6: a town id_lvl7 to id_lvl9: parts of a town

So, we first start with a restriction for the country "Liechtenstein", this means, we restrict like this:

 ... WHERE id_lvl2=???
Ok, but what value should id_lvl2 be? Or let us ask the other way round: What is the loc_id of a town in Liechtenstein? We can look for the name "Fürstentum Liechtenstein" or we can as well look for the ISO code "FL", as this is the code for Liechtenstein:

    SELECT [...] geodb_textdata.loc_id
    FROM geodb_textdata
    WHERE text_val='FL' AND
              text_type=500100001 /* ISO Code */
Or let us put both ideas together now:

    SELECT ...
    FROM geodb_hierarchies hi, geodb_textdata state
    WHERE hi.id_lvl2=state.loc_id AND
          state.text_val='FL' AND
          state.text_type=500100001 /* ISO 3166 */
We like to have the name of the towns as well (from geodb_textdata again), but we cannot re-use geodb_textdata state, as this will give us the name of a state ONLY. So we add another geodb_textdata, let us call it town. This town obviously should be restraint to any location found by the restraints so far, so we join it to the loc_id's, already found in geodb_hierarchies:

 ... WHERE [...] ort.loc_id = hi.loc_id AND
          town.text_type = 500100000 /* NAME */
And now everything in one place:

    SELECT town.text_val
    FROM geodb_hierarchies hi, geodb_textdata state, geodb_textdata town
    WHERE hi.id_lvl2=state.loc_id AND
          state.text_val='FL' AND
          state.text_type=500100001 /* ISO 3166 */ AND
          town.loc_id = hi.loc_id AND
          town.text_type = 500100000 /* NAME */
We are still missing the area codes, so we still add a geodb_textdata table to the query, also joining it to the so far found loc_id's and again restraining it to the wanted text_type, in this case the area code:

    SELECT code.text_val, town.text_val
    FROM geodb_hierarchies hi,
         geodb_textdata state,
         geodb_textdata town,
         geodb_textdata code
    WHERE hi.id_lvl2=state.loc_id AND
          state.text_val='FL' AND
          state.text_type=500100001 /* ISO 3166 */ AND
          town.loc_id = hi.loc_id AND
          town.text_type = 500100000 /* NAME */ AND
          code.loc_id = town.loc_id AND
          code.text_type = 500300000 /* AREA CODE */
So, there is some space left for improvements in the appearance of the output:

    SELECT code.text_val as "Areacode",
           concat(concat(state.text_val,'-'),town.text_val) as "Name"
    FROM geodb_hierarchies hi,
         geodb_textdata state,
         geodb_textdata town,
         geodb_textdata code
    WHERE hi.id_lvl2=state.loc_id AND
          state.text_val='FL' AND
          state.text_type=500100001 /* ISO 3166 */ AND
          town.loc_id = hi.loc_id AND
          town.text_type = 500100000 /* NAME */ AND
          code.loc_id = town.loc_id AND
          code.text_type = 500300000 /* AREA CODE */
    ORDER BY 2
(Postgres users have to replace any "concat" by "textcat")

Is it difficult? Well, I would say, it is somehow "complex", but not really difficult...

3.3) MUCH TOO SIMPLE: All Cantones of Switzerland

Ok, but??? We just did this one? -- Yes, you are right!

There are some small differences: we like to output the ISO code instead of the area codes (there are no area codes for cantones), we are looking for the text "Switzerland" with the text_type NAME (which is 5001000000) instead of for the ISO name "FL":

    SELECT iso.text_val as "ISO",
           name.text_val as "Name"
    FROM geodb_hierarchies hi,
         geodb_textdata state,
         geodb_textdata name,
         geodb_textdata iso
    WHERE hi.id_lvl2=state.loc_id AND
          state.text_val='Switzerland' AND
          state.text_type=500100000 /* NAME */ AND
          name.loc_id = hi.loc_id AND
          name.text_type = 500100000 /* NAME */ AND
          iso.loc_id = name.loc_id AND
          iso.text_type = 500100001 /* ISO 3166 */
Unfortunately, we not only have the cantones of Switzerland, but Switzerland itself as well. Actually, when we want the cantones only, we have to say so: we might want to add something like:

    WHERE ... AND level=3 ...
but we can do it more easily, when we don't look for name.loc_id = hi.loc_id, but look for name.loc_id = hi.id_lvl3:

    SELECT iso.text_val as "ISO",
           name.text_val as "Name"
    FROM geodb_hierarchies hi,
         geodb_textdata state,
         geodb_textdata name,
         geodb_textdata iso
    WHERE hi.id_lvl2=state.loc_id AND
          state.text_val='Switzerland' AND
          state.text_type=500100000 /* NAME */ AND
          name.loc_id = hi.id_lvl3 AND
          name.text_type = 500100000 /* NAME */ AND
          iso.loc_id = name.loc_id AND
          iso.text_type = 500100001 /* ISO 3166 */

4) SPECIAL CASES AND OTHER NOTES:

4.1) Dates giving the validity of entries:

All data do have two dates specifying the range of time, when this entry is valid. They can be found in almost any table in the attributes valid_since, date_type_since, valid_until and date_type_until.

date_type_since / date_type_until give the information, how exact this date is to be interpreted. 2000-01-01 might mean the exact date on first of January or it might mean the year 2000 or at some day in the month of January 2000. We cannot tell from valid_since / valid_until, as they will always contain an exact date. A special date type is called UNKNOWN_FUTURE_DATE (300500000) and means that the validity of the entry has not yet expired.

UNKNOWN_FUTURE_DATE has ALWAYS the date 3000-01-01, so if one wants to query for current data only, one would add something like this:

 ... WHERE valid_until >= [today]
One exception is found when querying for population data, as they are related to time stamps and not to time ranges. I have still not made up my mind about how to input the dates in these cases. This is not yet too problematic, as population data is still more or less a test case.

4.2) Special attributes in geodb_textdata:

To allow for multiple different names of a location (e.g. München / Munich in Germany or Belgium / België /Belgique), we have another attribute in geodb_textdata with the name text_locale.

Quite often, one does not care for many different names, but just for the names used in that location. So we added a boolean attribute is_native_lang with true (or 1) meaning: this name is used in this location.

Multiple names generate still more problems. Often (or even mostly) one just wants to have one ("the best fitting") name for that location. Returning many names can complicate an application program VERY much. For this we included an attribute is_default_name. It is garantueed that this attribute is either null (for certain text_type values) or exactly one time true for other text_type values like NAME.

By this you can garantuee with:

 ... AND is_default_name = 1
on text_type=NAME etc. that you will get back only one name, and this is the most reasonable name in a universal sense. I think it is garantueed as well that is_native_lang is true as well in this case.

As there are some situations, where there is no really perfect solution for a default name (take "Europe" for example knowing maybe 20 or so "native" names), we will revert to the english name in such cases.

This will give you another problem: say, you are hosting a slovenic web page. Querying for: ... is_default_name = 1 will result in the english name 'Europe', even though the slovenic web page would want 'Evropa' as name.

In this case one would have to revert to an exclusive-or constraint. This would allow for the slovenic language, IF there is an entry in this language (AND IF it is a native language for that location) OR IF this is not the case, revert to the default name.

Exclusive-or conditions do not belong to an earlier SQL-standardization, so it might be that one has to build such a construct via the application program, which might be rather complex.

4.3) geodb_coordinates is still beta:

4.3.1) This table is currently quite huge due to many derived values like sin_lon etc. pp.. The idea was to enhance query performance, e.g., when querying the distance between locations by using these ready-made sine and cosine values.

In praxis, it does not really look like this goal will be reached by this method, as it blows up the tables substantially (by this requiring more reads from the harddisk, which is REALLY slow in comparison).

This is the reason, why I'm thinking about removing these four attributes from geodb_coordinates.

4.3.2) The attribute coord_type contains the type of the coordinates, in this case WGS84. This is meaningful ONLY when one allows for other coord_types as well, e.g. UTM or whatever. On the other hand, these other coord_types would again just be derived values without giving you any additional information.

For this I'm thinking about removing the coord_type attribute from geodb_coordinates (and renaming coord_subtype to coord_type).

Powered by SourceForge® software project management solutions from VA Software
© Copyright 2005 - OSTG Open Source Technology Group, All Rights Reserved
About SourceForge.net  •  About OSTG  •  Privacy Statement  •  Terms of Use  •  Advertise  •  Get Support

IT Product Guide--enterprise IT product reviews